Cross-lingual pronunciation modelling for Indonesian speech recognition
Abstract
The resources needed to build Automatic Speech Recognition systems for a new language are considerable, and for many languages they are not available. This underlines the need for generic techniques that overcome the data shortage. Indonesian is one language that suffers from this problem, and its population and importance suggest it could benefit from speech-enabled technology. Accordingly, we investigate using English acoustic models to recognize Indonesian speech. The mapping process, in which the symbolic representation of the source language acoustic models is equated to the target language phonetic units, has typically been achieved using one-to-one mapping techniques. This mapping method does not allow predictable allophonic variation to be incorporated in the lexicon. In this paper we therefore present the use of cross-lingual pronunciation modelling to extract context-dependent mapping rules, which are subsequently used to produce a more accurate cross-lingual lexicon.
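The difference between a one-to-one mapping and context-dependent mapping rules can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the phone symbols, the fallback table `ONE_TO_ONE`, the single rule in `CONTEXT_RULES`, and the function `map_pronunciation` are hypothetical and are not taken from the paper, which extracts its rules from data rather than writing them by hand.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical one-to-one table: target (Indonesian) phone -> source (English) model name.
ONE_TO_ONE: Dict[str, str] = {
    "a": "aa", "i": "iy", "u": "uw", "k": "k", "t": "t", "b": "b",
}

# A rule key is (left context, phone, right context).
# None matches any neighbour; "#" stands for a word boundary.
Rule = Tuple[Optional[str], str, Optional[str]]

# Hypothetical context-dependent rule, for illustration only:
# a word-final "k" is mapped to a different source unit, "q".
CONTEXT_RULES: Dict[Rule, str] = {
    (None, "k", "#"): "q",
}


def map_pronunciation(
    phones: List[str],
    rules: Dict[Rule, str],
    fallback: Dict[str, str],
) -> List[str]:
    """Map a target-language pronunciation to source-language model names.

    The most specific matching rule wins; if no rule matches,
    the one-to-one fallback table is used.
    """
    padded = ["#"] + list(phones) + ["#"]
    mapped: List[str] = []
    for i in range(1, len(padded) - 1):
        left, phone, right = padded[i - 1], padded[i], padded[i + 1]
        # Try the full context first, then each one-sided context.
        candidates: List[Rule] = [
            (left, phone, right),
            (left, phone, None),
            (None, phone, right),
        ]
        for key in candidates:
            if key in rules:
                mapped.append(rules[key])
                break
        else:
            mapped.append(fallback[phone])
    return mapped


# Word-final "k" triggers the rule; word-initial "k" falls back to the table.
final_k = map_pronunciation(["t", "i", "t", "i", "k"], CONTEXT_RULES, ONE_TO_ONE)
initial_k = map_pronunciation(["k", "i", "t", "a"], CONTEXT_RULES, ONE_TO_ONE)
```

With an empty rule set the function reduces to the plain one-to-one mapping, which is the baseline the abstract contrasts against.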
Similar resources
Cross Lingual Modelling Experiments for Indonesian
The extension of Large Vocabulary Continuous Speech Recognition (LVCSR) to resource-poor languages such as Indonesian is hindered by the lack of transcribed acoustic data and appropriate pronunciation lexicons. Research has generally been directed toward establishing robust cross-lingual acoustic models, with the assumption that phonetic lexicons are readily available. This is not the case for ...
Multilingual phone clustering for recognition of spontaneous Indonesian speech utilising pronunciation modelling techniques
In this paper, a multilingual acoustic model set derived from English, Hindi, and Spanish is utilised to recognise speech in Indonesian. To achieve this, we use a two-tiered approach to port the multilingual models to a new language. In the first stage, we use an entropy-based decision tree to merge similar phones from different languages int...
Target Structured Cross Language Model Refinement
The task of porting Automatic Speech Recognition (ASR) technology to many languages is hindered by a lack of transcribed acoustic data, which in turn prevents the development of accurate acoustic models necessary for the recognition task. To overcome this problem, recent research has sought to exploit the similarity of sounds across languages, and use this similarity to adapt models from one or...
Towards automatic speech recognition without pronunciation dictionary, transcribed speech and text resources in the target language using cross-lingual word-to-phoneme alignment
In this paper we tackle the task of bootstrapping an Automatic Speech Recognition system without an a priori given language model, a pronunciation dictionary, or transcribed speech data for the target language Slovene – only untranscribed speech and translations to other resource-rich source languages of what was said are available. Therefore, our approach is highly relevant for under-resourced...
Pronunciation and Acoustic Model Adaptation for Improving Multilingual Speech Recognition
In this paper, we address the importance of pronunciation and acoustic model adaptation in multilingual speech recognition. When aiming at modeling several languages simultaneously, the degree of speaker and language variability is even greater than when concentrating on only one language. To compensate for the pronunciation variability across various speakers, bi-lingual pronunciation modeling is p...